Abstract

The language of partial order time expresses the issues central to many problems in
asynchronous distributed systems. A secure partial order time service would provide a general
method to develop secure protocols for these problems. In this paper, we sketch out these
issues and develop one such protocol: signed vector timestamps. The majority of this paper
is drawn verbatim from the first author’s October 1991 thesis proposal, the first research
into security issues for non-scalar time services and the original presentation of the SVT
protocol.

1. Introduction


The language of partial order time expresses the issues central to many problems in
asynchronous distributed systems. A secure partial order time service would provide a general
method to develop secure protocols for these problems. In this paper, we sketch out these
issues and develop one such protocol: signed vector timestamps. This paper is drawn
verbatim from the first author’s October 1991 thesis proposal¹ [20], except for minor edits, the
concluding Sections 5 and 6, and this paragraph. The original proposal document gives the
first research into security issues for non-scalar time services and the original presentation
of the SVT protocol. (It has recently come to our attention that this protocol was later
independently rediscovered [18].)

Traditionally, we regard time as a scalar value, totally ordering the events in a system.
However, the very nature of asynchronous distributed systems suggests that we should use
an order that is partial, not total, so that we can deliberately leave unordered two separated
events that have no knowledge of each other. In this partial order time model, both the
presence and the absence of a path between two events carry meaning: whether one event
necessarily precedes the other, or whether they are concurrent. If we use merely a total
order, we lose the latter information.

Many  problems  in  distributed  systems  reduce  to  questions  about  this  partial  order.  Our 
current  research  explores  building  tools  that  explicitly  grant  these  abilities,  thus  providing 
a  general  method  to  develop  protocols  to  solve  problems  in  this  class — known  problems 
that currently have separate ad hoc solutions, and also new problems that arise from this
unified  framework.  Our  research  also  explores  making  these  tools  robust  for  various  models 
of  Byzantine  failure  and  information  confinement;  thus,  protocols  based  on  these  tools  will 
be  secure  and  robust,  since  they  will  inherit  the  security  properties  already  present  in  the 
toolkit. 


2.  Partial  Order  Time 


Partial order time provides an alternative way to order events in an asynchronous distributed
system. The goal of the first author’s thesis [21] is to design a family of protocols that allow
processes in a system to examine local events in terms of this time model.

The  concept  of  partial  order  time  solves  some  of  the  difficulties  introduced  by  merging 
independent  timelines  into  the  same  totally  ordered  stream.  Using  only  a  partial  order  on 
events  lets  us  ensure  that  event  a  happens  “after”  event  b  if  and  only  if  a  can  observe  the 
results  of  b — total  orders  only  allow  the  converse  direction.  Deliberately  leaving  unordered 
two  events  that  lie  outside  each  other’s  “observation  cone”  frees  us  from  the  paradoxes  of 
conflicting  knowledge  horizons. 


¹The proposal document is available by request from the School of Computer Science, Carnegie Mellon
University, and also by ftp on lunch.trust.cs.cmu.edu as /usr/smith/public/PROPOSAL.ps.

A total order $<$ is consistent with partial order $\prec$ when $a \prec b \Rightarrow a < b$. (If we think
of orders as sets of ordered pairs, then a consistent total order is just a total order that
contains the partial order as a subset.) Any partial order extends to a consistent total order;
further, the set of consistent total orders uniquely characterizes a partial order. Research on
concurrent systems raises ideas of partial order time precisely because of the need to reason
about this entire set. Total order time (even the total order provided by real time) provides
only one member.


2.1.  Formal  Definitions 

We base our partial order time model on Lamport’s [11]. Formally, let us define an event to
be an instantaneous, atomic action within a system (as per Mattern [14]). Each event takes
place at one specific process. We partition events into three categories:

1.  send  events,  in  which  one  process  sends  a  message  to  another 

2.  receive  events,  in  which  one  process  receives  a  message  from  another 

3.  internal  events — anything  else  that  happens  within  a  process 

Send events take place at the sending process; receive events at the receiving process. Note
that since communication is asynchronous, a send event does not have to be simultaneous
with a receive event; depending on the failure model we use, a send event may not even have
a corresponding receive.

Isolating each event in a distributed system (e.g., requiring each process to throw away
its state after each event) would render irrelevant any discussion of event ordering. Only
when events can observe the results of previous events does the issue arise of deciding which
events are indeed “previous.” To capture this notion of “previous,” we will construct the
basic partial order (BPO) on events: we will write $a \to b$ to indicate that event $b$ potentially
depends on event $a$; that is, event $a$ must be in the past in the timeline experienced by $b$.
One interpretation of the BPO is that it expresses the basic flow of causality; a less mystical
interpretation is that it specifies some minimum required level of structure in possible time
sequences.

To  define  the  BPO,  we  proceed  from  two  basic  rules: 

• Recall that we assume that uniprocessors can totally order their own events. If events
$a$ and $b$ occur on the same process and $a$ precedes $b$ in this order, then let $a \to b$.

• Processes only influence other processes by sending messages, and are influenced only
by receiving them. So, if $a$ is the sending of a message and $b$ is the reception of that
message, then $a \to b$.

Formally, we let the BPO be the transitive closure of this relation. Note that for two
events $a$ and $b$, exactly one of three cases holds.

• $a \to b$: $b$ depends on $a$

• $b \to a$: $a$ depends on $b$

• $a \not\to b$ and $b \not\to a$; in this case, we say that $a$ and $b$ are concurrent.

We will write $a \nleftrightarrow b$ to indicate the latter relationship.

2.2.  The  Graph  Interpretation 



Interpreting the BPO as a directed acyclic graph (DAG) makes discussing some of its
properties easier. Construct a node for each event in the system, and draw directed edges according
to the two basic rules above. Then the relation $a \to b$ holds exactly when a path exists from
$a$ to $b$.
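
The graph reading can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class name `BPOGraph`, the event labels, and the depth-first search are hypothetical choices, not part of the original proposal. Edges are added by the two basic rules, and $a \to b$ is answered as a path query.

```python
from collections import defaultdict


class BPOGraph:
    """Events are nodes; edges follow the two basic rules of Section 2.1."""

    def __init__(self):
        self.succ = defaultdict(set)   # event -> direct successors
        self.last = {}                 # process -> most recent event there

    def add_event(self, process, event, received_from=None):
        # Rule 1: the previous event on the same process precedes this one.
        if process in self.last:
            self.succ[self.last[process]].add(event)
        # Rule 2: a send event precedes the matching receive event.
        if received_from is not None:
            self.succ[received_from].add(event)
        self.succ.setdefault(event, set())
        self.last[process] = event

    def precedes(self, a, b):
        """True when a path of length >= 1 leads from a to b."""
        stack, seen = list(self.succ[a]), set()
        while stack:
            e = stack.pop()
            if e == b:
                return True
            if e not in seen:
                seen.add(e)
                stack.extend(self.succ[e])
        return False

    def concurrent(self, a, b):
        return a != b and not self.precedes(a, b) and not self.precedes(b, a)


g = BPOGraph()
g.add_event("P1", "a1")                      # internal event at P1
g.add_event("P1", "s1")                      # P1 sends a message
g.add_event("P2", "b1")                      # internal event at P2
g.add_event("P2", "r1", received_from="s1")  # P2 receives that message
g.add_event("P1", "a2")                      # later internal event at P1
```

Here `g.precedes("a1", "r1")` holds through the message edge, while `a2` and `r1` have no path in either direction and are therefore concurrent.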

Regarding the BPO as a graph (without transitive closure) allows us two different ways
to define restrictions on a BPO. Let $S$ be a subset of the events (perhaps those events
occurring at some subset of the processes).


• We construct the nontransitive restriction of the BPO to $S$ simply by deleting all nodes
not in $S$, and all edges incident to these nodes.

• We construct the transitive restriction of the BPO by first taking the transitive closure
of the graph, and then deleting the nodes and edges not in $S$. This is the standard
restriction for partial orders.

We will use the notation $a \xrightarrow{S} b$ to indicate that event $b$ depends on $a$ under the nontransitive
restriction of the BPO to $S$, and $a \xrightarrow[S]{} b$ under the transitive restriction.
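
The difference between the two restrictions shows up as soon as a path leaves $S$ and comes back. The following sketch is a hypothetical illustration (the function names and the three-event example are our own assumptions): it represents the graph as a set of edge pairs and computes both restrictions.

```python
def reachable(edges, a, b):
    """True when a directed path of length >= 1 leads from a to b."""
    stack, seen = [y for (x, y) in edges if x == a], set()
    while stack:
        e = stack.pop()
        if e == b:
            return True
        if e not in seen:
            seen.add(e)
            stack.extend(y for (x, y) in edges if x == e)
    return False


def nontransitive_restriction(edges, S):
    """Delete nodes outside S and every edge incident to them."""
    return {(x, y) for (x, y) in edges if x in S and y in S}


def transitive_restriction(edges, nodes, S):
    """Take the transitive closure first, then keep only pairs inside S."""
    return {(x, y) for x in nodes for y in nodes
            if x in S and y in S and reachable(edges, x, y)}


# a -> m -> b, where the middle event m lies outside S.
nodes = {"a", "m", "b"}
edges = {("a", "m"), ("m", "b")}
S = {"a", "b"}

kept = nontransitive_restriction(edges, S)
closed = transitive_restriction(edges, nodes, S)
```

In this example the nontransitive restriction loses the dependence of `b` on `a` (the only path ran through `m`), while the transitive restriction retains it.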


3.  Secure  Clocks  for  Partial  Order  Time 

This  paper  proposes  a  secure  toolkit  for  distributed  partial  clocks.  We  now  offer  a  more 
detailed  discussion  on  what  we  mean  by  this — Section  3.1  presents  the  basic  issues  involved 
in  defining  these  clocks,  and  Section  3.2  examines  security  and  robustness  issues. 


3.1.  Clocks  for  Partial  Order  Time 

The problem of robustly implementing a traditional clock on a distributed system (where by
“clock” we refer to a global event counter, although some ideas extend to approximations of
real time) is difficult but solvable (e.g., [12], [13], [22]). Researchers observe that a necessary
condition for distributed clocks is that the total order calculated be consistent with the BPO.
That is, the system computes a time function $T$, mapping events to integer timestamps, such
that for all events $a, b$,

$$a \to b \;\Longrightarrow\; T(a) < T(b)$$

However, we stress the importance of a system being able to calculate the BPO exactly.
Our goal is to implement a distributed partial clock: we want a timestamp set, with partial
order $\prec$; a function $T$ from events to timestamps satisfying

$$T(a) \prec T(b) \;\Longleftrightarrow\; a \to b;$$

and the ability for processes within the distributed system to compute the function $T$ and
the comparison $\prec$.

More precisely, we want our partial clock toolkit to enable processes to calculate
these functions for the events they know about: process $P_i$ need only calculate $T$ and $\prec$ on
some subset $E_i$ containing the events perceivable by $P_i$. Defining this notion is a bit tricky.
The weakest nontrivial definition follows:

• if event $a$ occurs at $P_i$, then $a \in E_i$

• if event $a$ is the sending of a message to $P_i$, which $P_i$ received, then $a \in E_i$

Note that this is nontrivial because, if events $a, b$ are the sending of messages to process $P_i$
and event $c$ is internal, then answering the questions of whether $a \to b$ or $c \to a$ may require
information not easily available to this process. This definition is still rather weak: suppose
the message sent to process $P_i$ in event $a$ at process $P_j$ contains information about events
preceding $a$? One could argue that $P_i$ ought to be able to order those events too.² Further,
suppose that event $b$ is the reception at process $P_j$ of message $m$ sent by event $a$ at $P_i$.
Should $b \in E_i$? Clearly $P_i$ knows that its event $a$ influenced $b$, but does $P_i$ necessarily know
that $b$ exists?

In  the  spirit  of  saying  that  our  definition  of  BPO  is  purely  syntactic,  we  claim  that  this 
weak  definition  of  Ei  is  the  corresponding  purely  syntactic  version.  As  with  BPO,  we  can 
construct more complicated extensions of this basic concept by considering other issues.

3.2.  Security  Issues 

The problems of robustness and security in distributed partial clocks take two forms: fault
tolerance, and some special challenges the nature of partial clocks creates for information
confinement.


²This fact (that the BPO is transitive but this notion of “perceivability” is not) will cause problems
when we consider information confinement (in Section 3.2).

3.2.1. Fault Tolerance


A  natural  question  to  ask  when  considering  a  distributed  system  that  consists  of  a  physically 
distributed  collection  of  machines  is:  what  happens  when  one  of  them  goes  awry?  In  our 
distributed  systems  model  we  have  several  elements: 

•  physical  processors 

•  communication  links  between  processors 

•  processes  running  on  processors 

Physical  machines  can  fail  (either  gracefully  or  maliciously);  processes  can  be  downright 
malevolent;  processes  go  into  suspension  while  their  machine  is  down,  or  when  they  move 
operation  to  a  different  machine;  communication  links  can  deliver  messages  out  of  order,  or 
garbled,  or  not  at  all. 

(In the remainder of this paper, we make the simplifying assumptions that each process
resides on its own processor, and that the network never corrupts messages.)

We would like our distributed partial clocks to maintain some kind of reasonable
performance in the face of such troubles. We can imagine the standard spectra measuring severity
of individual failures and number of such failures, with a family of implementations that
achieve increasing levels of performance on these spectra, probably by trading off against
simplicity and efficiency, and by balancing the various types of robustness.

However, a new issue is exactly what we should regard as “reasonable performance.” The
functions we wish our clocks to calculate capture distributed, global properties. Even though
events $a$ and $b$ might occur in the immediate proximity of a process $P_i$ (e.g., in the weakest
$E_i$), the individual arcs in the BPO graph that cause $a \to b$ to hold might be distributed
throughout the entire system. We could require the nonfaulty processes to calculate the
BPO correctly on their perceivable events; less strongly, we could restrict these events to
those belonging to nonfaulty processes (so we absolve nonfaulty $P_i$ from any confusion that a
message from a faulty process causes). Some of our work already suggests even weaker fault
tolerance: requiring nonfaulty processes only to calculate the nontransitive restriction³ of the
BPO to the set of nonfaulty processes. Each of these cases partitions the set of processes, and
hence the set of events, into nonfaulty and faulty categories, but only specifies how events in
the former should be handled. How nonfaulty processes should deal with bad events raises
another set of research questions.


3.2.2.  Information  Confinement 

To illustrate another set of security issues, we now consider an especially naive
implementation of partial clocks. Suppose a distributed system explicitly maintains the BPO graph.


³Recall the definition in Section 2.2.

After initialization, each process starts building a linear chain of its internal events. When
sending a message, a process sends along its chain; when receiving a message, a process
incorporates the graph information contained into its own graph. Consequently, whenever
a process executes an event, it knows the entire BPO subgraph induced by taking all the
ancestors of that event. This implementation allows processes to calculate the $T$ and $\prec$
relations. However, even aside from questions of efficiency and fault tolerance, this
implementation would be unsatisfactory in two crucial areas: reasons of security policy and
reasons of innate causality may render it undesirable or impossible for a process to know the
complete history behind every event.


Confinement by policy. Recall in Section 3.1 we offered a weakest definition of the events
perceivable by a process: $E_i$, consisting of the events internal to process $P_i$ and the send
events of messages received by process $P_i$. In many real instances of distributed systems
we may want to enforce an information confinement rule such as “process $P_i$ can know
nothing of the global BPO graph except its transitive restriction to $E_i$, unless authorization
is explicitly granted in some way.”

For example, consider distributed workstations in a university environment. Just because
Alice  sends  a  message  to  Bob  does  not  mean  Bob  has  the  right  to  know  everything  Alice 
has  been  doing.  We  need  to  consider  confinement  from  the  future  as  well:  professors  Bob 
and  Carla  may  need  to  have  a  lengthy  discussion  of  student  Alice’s  proposal — but  naturally 
Alice  should  not  be  privy  to  this  discussion,  or  even  to  the  fact  that  “a  lengthy  discussion 
of  my  proposal  is  going  on.” 

We  formalize  these  concepts  by  introducing  two  new  terms: 

•  forward  confinement:  keeping  private  information  about  a  process  from  leaking  to 
processes  it  influences  in  the  BPO 

•  backward  confinement:  keeping  private  information  about  a  process  from  leaking  to 
processes  that  have  influenced  it 

Enforcing principles of forward and backward information confinement raises some
interesting implementation challenges. Let $a$ be a send event at process $P_1$, and let events $b$ at $P_2$
and $c$ at $P_3$ be in the future of $a$ (that is, $a \to b$ and $a \to c$) and suppose processes $P_2$ and $P_3$
need to know different details of the history of $a$ in order to timestamp $b$ and $c$, respectively.
Forward confinement requires that $P_1$ not transmit this information with $a$. But backward
confinement requires that $P_2$ and $P_3$ cannot just query $P_1$!


Confinement  by  structure.  Confinement  principles  are  just  that — principles  we  impose 
for  reasons  external  to  the  basic  problem  of  tracing  causality.  However,  some  common 
system  mechanisms  create  information  barriers  that  fundamentally  affect  this  basic  problem. 
Suppose  student  Alice  sends  an  anonymous  suggestion  to  the  suggestion  box  maintained  by 
Professor  Bob  for  his  class,  who  acts  on  this  suggestion.  Bob’s  actions  depend  on  Alice’s 
suggestion, but he cannot know whose action this suggestion is. Further, the suggestion is
not  completely  anonymous,  for  in  her  later  interactions  with  Bob,  Alice  knows  that  Bob’s 
actions  follow  from  her  actions.  Greif  [7]  calls  this  the  phenomenon  of  hidden  causality,  and 
gives  a  more  fundamental  example:  the  relation  between  V  and  P  operations  on  a  binary 
semaphore. 

How to resolve the problem of hidden causality in a distributed partial clock is another
research issue we intend to explore. We may need to extend the BPO formalism to make it
sufficiently rich to express all these nuances.


4.  The  SVT  Protocol 
4.1.  Overview 

The  central  issue  in  building  a  secure  distributed  partial  clock  toolkit  is  how  to  keep  track 
of  the  partial  order.  Essentially,  our  BPO  is  a  dynamically  changing  directed  acyclic  graph 
whose  behavior  meets  the  following  criteria: 

•  Monotonicity.  As  [real]  time  progresses,  edges  and  nodes  are  added.  In  the  basic 
problem,  nothing  is  deleted. 

• Distribution. New nodes originate from individual processes within a distributed
system; new edges from either individual processes or (in the case of message
transmission) from pairs of processes.

Our toolkit needs to allow individual processes to answer connectivity queries about this
graph, and hence must maintain this graph, at least in some virtual form. The distributed
nature of the DAG forces processes to require nonlocal information in order to answer these
queries. The issue of how and when this information should propagate (piggybacked on
system messages, or transmitted only when requested by a query) delineates one axis of
possible implementation approaches.

In this section we outline a starting point for our implementation work: signed vector
timestamps  (SVTs).  This  approach  falls  at  the  “piggyback”  end  of  this  axis.  The  SVT 
protocol  extends  Lamport  event  counters  to  provide  an  implementation  of  distributed  partial 
clocks  that  is  moderately  robust  against  Byzantine  failure.  We  conjecture  that  this  may  be 
the  best  protection  possible  if  we  disallow  any  special  underlying  computational  structure. 

However, this initial approach offers two principal drawbacks: ineffectiveness at enforcing
forward  confinement,  and  computational  inefficiency  in  certain  scenarios.  Analyzing  these 
drawbacks  suggests  several  new  directions  for  implementation  research. 

We begin by discussing Lamport clocks (Section 4.2), then extend them to vectors
(Section 4.3), and then turn to SVTs: the protocol, its problems, and the new research avenues
suggested  (Sections  4.4  and  4.5). 

4.2. Lamport Clocks

Lamport [11] discusses the issue of determining the BPO and presents an elegant partial
solution using local event counters. Timestamps sent along with every message keep the
local counters roughly synchronized, and capture a total order⁴ consistent with the BPO.

Formally, each process $P_i$ maintains a local scalar clock $C_i$. Process $P_i$ marks each event
$a$ that occurs there and each message $m$ it sends with a timestamp $C(a)$ (or $C(m)$), which
reflects the current value of the clock $C_i$. This current value changes with each event $a$ at
$P_i$; the type of event determines the change.


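$a$ is internal:
    $C(a) \leftarrow C_i$
    $C_i \leftarrow C_i + 1$

$a$ is sending of message $m$:
    $C(a) \leftarrow C_i$
    $C(m) \leftarrow C_i$
    $C_i \leftarrow C_i + 1$

$a$ is reception of $m$:
    $C_i \leftarrow \max\{C_i, C(m) + 1\}$
    $C(a) \leftarrow C_i$
    $C_i \leftarrow C_i + 1$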

These  timestamps  order  events  consistently  with  the  BPO: 

Theorem 1 For all events $a, b$, if $a \to b$ then $C(a) < C(b)$.
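
The three update rules translate directly into code. The sketch below is a minimal illustration, assuming a hypothetical class `LamportProcess` with one method per event type; message transport is reduced to passing the timestamp by hand.

```python
class LamportProcess:
    """One process P_i with its scalar clock C_i."""

    def __init__(self):
        self.clock = 0

    def internal(self):
        stamp = self.clock          # C(a) <- C_i
        self.clock += 1             # C_i <- C_i + 1
        return stamp

    def send(self):
        stamp = self.clock          # C(a) <- C_i and C(m) <- C_i
        self.clock += 1             # C_i <- C_i + 1
        return stamp                # this value travels with the message

    def receive(self, msg_stamp):
        self.clock = max(self.clock, msg_stamp + 1)   # C_i <- max{C_i, C(m) + 1}
        stamp = self.clock                            # C(a) <- C_i
        self.clock += 1                               # C_i <- C_i + 1
        return stamp


p1, p2 = LamportProcess(), LamportProcess()
a = p1.internal()     # internal event at P1
s = p1.send()         # P1 sends a message stamped s
c = p2.internal()     # internal event at P2, concurrent with a and s
r = p2.receive(s)     # P2 receives the message
```

In this run `a < s < r`, as Theorem 1 requires. The timestamps also satisfy `c < s` even though `c` and `s` are concurrent, which is one concrete way to see that the converse of the theorem fails.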

However, this method has two principal drawbacks: it only produces a total order (the
converse to Theorem 1 does not hold), and it is egregiously unsecure, as each process’s clock
is essentially world-writable. For example, suppose process $P_i$ has $C_i = s$ and receives a
message $m$ from process $P_j$ with $C(m) = t \gg s$. Ostensibly, the timestamp $t$ testifies that
at least $t - s$ events have occurred in the outside world since $P_i$ last received a message.
But $P_i$ cannot distinguish this presumed scenario from one where malicious process⁵ $P_j$
arbitrarily inflates the timestamp. After all, such maliciousness offers advantages:⁶

• If $P_i$ lacks a “sensibility check” on its timestamps but plans to interact with process
$P_k$ that does, then $P_j$’s action causes $P_k$ to erroneously identify $P_i$ as faulty.

• If processes store timestamps as a fixed-length word with maximum value $N$, then $P_j$
could use $t = N - 1$ and cause $P_i$ to roll over, either making $P_i$ appear faulty or causing
dangerous anachronism.


⁴Strictly speaking, it produces a partial order, as events at two processes could receive the same value
timestamp.  But  we  can  easily  linearize  this  order  by  choosing  a  linear  order  on  the  processes  and  using  that 
order  to  break  ties. 

⁵We oversimplify here: consider that $P_j$ itself may only be the last link in a chain of honest processes
unwittingly passing on bogus information introduced by the malicious process.

⁶Again, actual scenarios may be even more complex: $P_i$ may be just a link in a chain to reach the intended
victim process.

• If processes store timestamps as unbounded values, $P_j$ could still increase by orders
of magnitude the number of words $P_i$ uses for its clock. This both slows down $P_i$’s
dealings with its neighbors, and allows $P_j$ to observe the spread of its influence, a
violation of backward confinement.

•  If  Pj  interacts  with  most  processes  fairly  regularly,  then  it  can  render  the  entire  clock 
system  effectively  useless  by  blowing  up  every  timestamp  with  each  message. 
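
The roll-over attack in the second bullet can be made concrete. The sketch below is an illustration only: the 16-bit word size `WORD_MAX` and the wrap-around arithmetic are our own assumptions about how a fixed-length clock might be stored, not something the protocol specifies.

```python
WORD_MAX = 2**16 - 1   # hypothetical maximum value N of a fixed-length timestamp


def receive(clock, msg_stamp):
    """Lamport receive rule, stored in a word that wraps around past WORD_MAX."""
    clock = max(clock, msg_stamp + 1)        # C_i <- max{C_i, C(m) + 1}
    stamp = clock % (WORD_MAX + 1)           # C(a) <- C_i
    clock = (clock + 1) % (WORD_MAX + 1)     # C_i <- C_i + 1, with wrap-around
    return stamp, clock


honest_clock = 17
# A malicious sender claims timestamp N - 1, whatever its real history is.
stamp, honest_clock = receive(honest_clock, WORD_MAX - 1)
```

After this single message the honest clock has been pushed from 17 to the maximum value and then wraps to zero, so later events at the honest process receive smaller timestamps than earlier ones.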

4.3.  Extending  Lamport  Clocks  to  Vectors 

Our SVT implementation extends Lamport counters by making timestamps vectors instead
of scalars, and incorporating digital signatures. These extensions rectify the cited drawbacks.

In the vector timestamp protocol, processes maintain a vector indicating their “knowledge
horizon”: the most recent event they (syntactically) know about at each other process.
(Technically, we should note that this structure is not so much a vector but an indexed set:
the length need not be fixed, nor the indices known a priori. This raises some interesting
research questions regarding what to do with lost or missing members.) The SVT protocol
extends this by using public key decryption to authenticate these timestamps.

The vector timestamp protocol exactly captures the BPO. The SVT protocol even allows
the set of honest processes, no matter how few, to calculate the nontransitive restriction
of the BPO despite any action whatsoever by malicious processes. The concept of using
dependency vectors without authentication surfaces in earlier research (e.g., [23], [15], [5],
[6]), but this paper is the first to consider these vectors as an implementation for a general
purpose, secure partial clock toolkit.

The  remainder  of  this  section  presents  the  basic  protocol,  and  Sections  4.4  and  4.5  add 
authentication. 


4.3.1.  The  Vector  Timestamp  Protocol 

We begin by discussing the basic protocol, without authentication. Let $n$ be the number
of processes. Each process $P_i$ maintains a local clock $C_i$, an event counter. Each process
also maintains an $n$-element vector $V_i$ to keep track of the most recent event it knows about
at every other process. We will use the notation $V_i(j)$ to refer to the $j$th component of
vector $V_i$; this component reflects process $P_i$’s most current knowledge of process $P_j$. We
can dispense with $C_i$ altogether, and just store the value as $V_i(i)$. Let each component of
each $V_i$ be zero initially.

Each process will timestamp its events and outgoing messages with an $n$-element
vector. To follow our previous notation strictly, we should denote these timestamps by $V(a)$;
however, to make component indexing easier, we will use subscripting instead: $V_a$ is the
timestamp on event $a$, $V_m$ on message $m$. The following table outlines how processes obtain
these timestamp vectors and update their own vectors. Let event $a$ occur on process $P_i$.

$a$ is internal:
    $V_i(i) \leftarrow V_i(i) + 1$

$a$ is sending of message $m$:
    $V_i(i) \leftarrow V_i(i) + 1$
    $V_a \leftarrow V_i$
    $V_i(i) \leftarrow V_i(i) + 1$

$a$ is reception of $m$:
    $\forall j \neq i \quad V_i(j) \leftarrow \max\{V_i(j), V_m(j)\}$
    $V_i(i) \leftarrow V_i(i) + 1$

The reason for the two increments in send events may not be intuitively clear. We
increment  the  local  component  before  sending  a  message  so  that  the  receiving  process  can 
treat  all  components  equally  when  maximizing.  We  increment  again  so  that  the  subsequent 
event  at  the  sending  process  will  not  precede  the  receive  event. 

We  define  a  natural  ordering  on  the  timestamp  vectors. 

Definition 2 For vectors $V, W$, we say that $V \prec W$ when $\forall i\; V(i) \leq W(i)$ and $\exists i\; V(i) < W(i)$.

This  ordering  exactly  captures  the  BPO. 

Theorem 3 For all events $a, b$, $a \to b$ iff $V_a \prec V_b$.
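
A small sketch may help make the update rules and the ordering of Definition 2 concrete. It rests on assumptions of our own: the class name `VectorProcess` is hypothetical, processes are indexed from 0, and for internal and receive events we take the event's timestamp to be a copy of the vector right after the update (for send events it is the copy taken between the two increments, which also travels on the message).

```python
class VectorProcess:
    """Process P_i with its knowledge-horizon vector V_i."""

    def __init__(self, i, n):
        self.i = i
        self.v = [0] * n            # all components initially zero

    def internal(self):
        self.v[self.i] += 1
        return tuple(self.v)        # assumed: timestamp = vector after the update

    def send(self):
        self.v[self.i] += 1         # first increment
        stamp = tuple(self.v)       # V_a <- V_i, also carried on the message
        self.v[self.i] += 1         # second increment
        return stamp

    def receive(self, vm):
        for j in range(len(self.v)):
            if j != self.i:
                self.v[j] = max(self.v[j], vm[j])
        self.v[self.i] += 1
        return tuple(self.v)        # assumed: timestamp = vector after the update


def precedes(v, w):
    """Definition 2: v <= w in every component and v < w in at least one."""
    return all(x <= y for x, y in zip(v, w)) and any(x < y for x, y in zip(v, w))


p0, p1 = VectorProcess(0, 3), VectorProcess(1, 3)
a = p0.internal()     # (1, 0, 0)
s = p0.send()         # (2, 0, 0)
c = p0.internal()     # (4, 0, 0): after the send, concurrent with the receive
b = p1.internal()     # (0, 1, 0)
r = p1.receive(s)     # (2, 2, 0)
```

Unlike the scalar clock, the comparison here distinguishes the two situations: `precedes(s, r)` holds, while neither `precedes(c, r)` nor `precedes(r, c)` does.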


4.3.2. Security Problems

Consider the timestamp vector $V_i$ on process $P_i$. It is true that the components $V_i(j)$ are
world-writable (for $i \neq j$) in the sense that a party sending $P_i$ a message can force these
components arbitrarily high. If $P_k$ has $V_k(j) = 42$ (for $k, j, i$ distinct), then $P_k$ can send a
message to $P_i$ and know that afterward, $V_i(j) \geq 42$. If $P_k$ is malicious, then it can render
the vector $V_i$ effectively useless.

But assume for the moment that everyone is honest. Let 0 be the initial value of all
vector components. Process $P_k$ can change a component of its vector in only two ways: it
can increment its own component

$$V_k(k) \leftarrow V_k(k) + 1$$

or it can copy other components from incoming messages

$$V_k(j) \leftarrow \max\{V_k(j), V_m(j)\}$$

The  vector  on  a  message  is  just  a  copy  of  the  vector  at  the  sending  process.  Hence  we  can 
observe: 

Theorem 4 Let $a$ be an event on process $P_i$, and let $j \neq i$. Then $V_a(j)$ is either 0 or a
copy of $V_m(j)$, where $m$ is a message sent by event $b$ at process $P_j$, and $b \to a$.


So processes now have some means of detecting when someone is sending them bad
information in a message’s timestamp: they know that each nonzero component $j$ of the
timestamp should have been originally generated by process $P_j$.


4.4.  SVT:  Adding  Signatures  to  the  Vectors 

By adding signatures to the vector timestamp scheme, we can add tolerance against
Byzantine faults: arbitrary behavior by arbitrary numbers of processes.

Let us assume a public key decryption scheme, where for any $x$ each process $P_i$ can
generate a signature $E_i(x)$ such that

• any process $P_j$ can, given $i$, $x$, and $y$, quickly determine whether $y = E_i(x)$

• for $j \neq i$, any finite set $X$, and any $z \notin X$, no process $P_j$ can calculate $E_i(z)$, even if
it has an oracle for $E_i$ on $X$.

We directly extend the basic vector timestamp protocol to produce the secure protocol
SVT. Namely, we just include and check signatures.

Every vector $V$ will now have two fields in each component: the actual value $V(i)$, and
the signature $V(i)'$. When a process $P_i$ sends a message $m$, it sets

$$V_i(i)' \leftarrow \mathcal{E}_i(V_i(i))$$

and then assigns $V_m \leftarrow V_i$. When a process $P_i$ receives a message $m$, it first checks the
signatures

$$\forall j:\ V_m(j)' = \mathcal{E}_j(V_m(j))$$

before accepting it. If $P_i$ decides to copy a component from the incoming message

$$V_i(j) \leftarrow V_m(j)$$

then $P_i$ copies the signature as well

$$V_i(j)' \leftarrow V_m(j)'$$
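
The send and receive steps above might be sketched in Python as follows. This is a toy under loudly stated assumptions, not the protocol's reference implementation: `ToyKey` is a hypothetical textbook-RSA stand-in built from publicly known Mersenne primes (trivially breakable, chosen only so the example is self-contained), the names `SVTProcess`, `component_ok`, and `KEYS` are invented here, a component that was never set is assumed to be an unsigned 0, and each receive is assumed to count as an event. The three-argument `pow` with a negative exponent needs Python 3.8 or later.

```python
import hashlib

PUB_EXP = 65537  # public exponent shared by every toy key


class ToyKey:
    """Textbook RSA over two Mersenne primes. Illustrative only, NOT secure."""

    def __init__(self, a, b):
        p, q = 2**a - 1, 2**b - 1
        self.n = p * q                                   # public modulus
        self._d = pow(PUB_EXP, -1, (p - 1) * (q - 1))    # private exponent

    def _digest(self, x):
        h = hashlib.sha256(str(x).encode()).digest()
        return int.from_bytes(h, "big") % self.n

    def sign(self, x):
        """The signature of x; uses the private exponent."""
        return pow(self._digest(x), self._d, self.n)

    def verify(self, x, y):
        """Check that y is the signature of x using only public values."""
        return pow(y, PUB_EXP, self.n) == self._digest(x)


KEYS = [ToyKey(61, 89), ToyKey(107, 127), ToyKey(521, 607)]


def component_ok(j, x, y):
    if y is None:
        return x == 0      # assumption: an untouched component is an unsigned 0
    return KEYS[j].verify(x, y)


class SVTProcess:
    def __init__(self, i):
        self.i = i
        self.val = [0] * len(KEYS)       # the values V_i(j)
        self.sig = [None] * len(KEYS)    # the signatures V_i(j)'

    def send(self):
        self.val[self.i] += 1
        # Sign our own component, then copy the whole vector onto the message.
        self.sig[self.i] = KEYS[self.i].sign(self.val[self.i])
        return list(zip(self.val, self.sig))

    def receive(self, vm):
        # Check every signature before accepting the message.
        if not all(component_ok(j, x, y) for j, (x, y) in enumerate(vm)):
            return False
        for j, (x, y) in enumerate(vm):
            if j != self.i and x > self.val[j]:
                self.val[j], self.sig[j] = x, y   # copy value and signature
        self.val[self.i] += 1                     # assumption: receive is an event
        return True


p0, p1, p2 = (SVTProcess(i) for i in range(3))
ok1 = p1.receive(p0.send())          # P_0 -> P_1
m2 = p1.send()                       # P_1 relays P_0's signed component
ok2 = p2.receive(m2)                 # P_1 -> P_2

forged = list(m2)
forged[0] = (forged[0][0] + 5, forged[0][1])   # inflate P_0's value, keep old signature
ok3 = p2.receive(forged)
```

The two honest deliveries are accepted, and the relayed component for `p0` still verifies at `p2` because its signature travels with it. The message with the inflated value is rejected, since the old signature does not match the new value.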


Let  H  be  the  set  of  honest  processes.  The  SVT  protocol  allows  honest  processes  to 
correctly  calculate  the  nontransitive  restriction  of  the  BPO. 


Theorem 5 If $a \xrightarrow{H} b$ then $V_a \prec V_b$.

In the other direction, we can show something a bit stronger.^7


Theorem 6 Let events $a$ and $b$ occur at processes $P_i$ and $P_j$. Let $P_j \in H$, and let $V_a$ have
proper signatures. If $V_a \prec V_b$ then $a \rightarrow b$.


A nice thing to observe about SVT is that honest processes do not need to know which
other processes are honest.


4.5.  Problems  with  the  SVT  Implementation 

The SVT protocol has several drawbacks. For one thing, its tolerance of Byzantine failure
is not ideal: the "reasonable performance" it achieves falls short of what we would have
desired. We suspect that this behavior may be inherent for this style of implementation.
Another  problem  is  that  the  amount  of  information  that  SVT  timestamps  contain  violates 
forward  confinement  and,  in  certain  situations,  might  be  rather  inefficient. 


4.5.1.  Lost  Influence 

In Section 3 we state that a central goal of this work is to discover a protocol by which an
honest process $p$ can determine the BPO among its perceivable events $E_p$. The SVT protocol
does not achieve this goal. It is true that in SVT, a malicious process cannot overwrite the
clock values of other processes, and cannot generate arbitrarily large values in timestamp
components corresponding to honest processes. However, the protocol does permit spoofing
(in the sense of Herlihy and Tygar [9]). During the course of system operation, a process
will receive many timestamp pairs $x, \mathcal{E}_i(x)$ for many of the $i$. A process is supposed to use
the largest $x$ it has received in each component, but it can use any other one it wants to.

For  example,  suppose  Alice  and  the  Bank  are  honest,  but  Carla  is  pretty  nasty.  Suppose 
Alice  deposits  $10  in  her  previously  empty  bank  account,  and  then  gives  Carla  a  check  for 
$10. Carla can roll back all her timestamps and quickly cash the check, and the Bank would
believe that Alice's request depends on Carla's, and thus will execute Carla's first, getting
Alice into trouble.
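
The rollback in this example might be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the clock values, the helper names `sign`, `valid`, and `precedes`, the earlier exchange in which Carla and Alice each saw the other's first event, and the toy textbook-RSA keys built from Mersenne primes (not secure, and requiring Python 3.8 or later for the modular inverse).

```python
import hashlib

PUB_EXP = 65537
ALICE, BANK, CARLA = 0, 1, 2
# Toy RSA keys from Mersenne primes: illustrative only, NOT secure.
PRIMES = [(61, 89), (107, 127), (521, 607)]
N = [(2**a - 1) * (2**b - 1) for a, b in PRIMES]
PRIV = [pow(PUB_EXP, -1, (2**a - 2) * (2**b - 2)) for a, b in PRIMES]


def digest(i, x):
    h = hashlib.sha256(str(x).encode()).digest()
    return int.from_bytes(h, "big") % N[i]


def sign(i, x):
    """Signature of x by process i; in a real system only process i could call this."""
    return pow(digest(i, x), PRIV[i], N[i])


def valid(stamp):
    """Check each component's signature, as an SVT receiver would."""
    for i, (x, y) in enumerate(stamp):
        if y is None:
            if x != 0:
                return False
        elif pow(y, PUB_EXP, N[i]) != digest(i, x):
            return False
    return True


def precedes(va, vb):
    a = [x for x, _ in va]
    b = [x for x, _ in vb]
    return all(x <= y for x, y in zip(a, b)) and a != b


zero = (0, None)
carla_1 = (1, sign(CARLA, 1))     # Carla's first event, already seen by Alice
alice_1 = (1, sign(ALICE, 1))     # an old pair Carla received from Alice earlier

deposit = [(2, sign(ALICE, 2)), zero, carla_1]    # Alice's deposit request
check = [(3, sign(ALICE, 3)), zero, carla_1]      # Alice hands Carla the check

honest = [check[ALICE], zero, (2, sign(CARLA, 2))]   # what Carla should send
rolled = [alice_1, zero, carla_1]                    # what Carla sends instead
```

Both `honest` and `rolled` pass the signature check. The honest timestamp places Carla's request after the deposit, but the rolled-back one precedes the deposit's timestamp, so the Bank has no way to see that the check depends on the deposit.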

The problem remains that any dealings with dishonest or faulty processes will be suspect.
We conjecture^8 that this behavior is inherent for a large family of implementations; any
protocol built around the following assumptions will risk losing chains of influence through
malicious processes.

• the processes themselves do all the computation; nothing is hidden or unconscious

• no honest process has a right to know anything about the internal events of any other
process

^7 Actually, the question of whether Theorem 6 is stronger than the converse of Theorem 5 is not answered
so easily: we could interpret proving the latter as being able to distinguish $a \rightarrow b$ from $a \xrightarrow{H} b$, which broaches
the awkward topic of honest processes identifying the dishonest ones. Research questions remain here.

^8 Since the preparation of the original document in 1991, we have formalized and proved this conjecture.
The proof will appear in the first author's thesis [21].

4.5.2.  Confinement  and  Efficiency 

Since SVT timestamps are real data packets which entirely determine event ordering, the
SVT implementation easily enforces backward information confinement. A process examining
a timestamp does not need to bother anyone else. However, a cursory inspection of the
protocol reveals a fundamental violation of forward confinement: the fact that processes
must pass on the most recent timestamp components from everyone in the system.

If  the  distribution  of  messages  is  fairly  uniform,  then  SVT  is  reasonably  efficient.  But  the 
real  world  contains  highly  non-uniform  scenarios.  For  example,  consider  a  system  consisting 
of  clusters  of  workstations  at  various  universities.  Most  of  the  communication  takes  place 
within each cluster, so the system graph has two fairly densely connected components, with
only a few edges between the components. If we have $n$ processes and only $\delta \ll n$ messages
across this cut, then we're transmitting much extra data: $\Omega(\delta n)$ when we really only need

$O(\delta^2)$.

One can argue similarly that much of the timestamp information in a tightly coupled cluster
is irrelevant, as everyone knows everything already. This situation is troublesome because
of redundant data, rather than unnecessary data. Some fairly straightforward methods exist
to reduce this waste: consider that process $P_1$ can obtain from the timestamps it exchanges
with process $P_2$ a good lower bound for each component in $P_2$'s internal vector, and only
needs to transmit the components that exceed this bound.^9
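
One way to picture this lower-bound idea is the Python sketch below. It is a hypothetical illustration, not a published optimization: the names `Peer`, `known`, `send`, and `receive` are invented here, signatures are omitted for brevity (in SVT each transmitted component would carry its signature), and the sketch assumes reliable, in-order delivery so that a sender may safely record what its peer will know once the message arrives.

```python
class Peer:
    """Process that sends only the components a peer may not yet know."""

    def __init__(self, k, n):
        self.k = k
        self.v = [0] * n
        self.known = {}      # peer id -> lower bound on that peer's vector

    def _bound(self, peer):
        return self.known.setdefault(peer, [0] * len(self.v))

    def send(self, peer):
        self.v[self.k] += 1
        bound = self._bound(peer)
        # Transmit only the components that exceed the lower bound.
        delta = {j: x for j, x in enumerate(self.v) if x > bound[j]}
        # Assumption: delivery is reliable, so the peer will hold these values.
        for j, x in delta.items():
            bound[j] = x
        return self.k, delta

    def receive(self, msg):
        sender, delta = msg
        bound = self._bound(sender)
        for j, x in delta.items():
            bound[j] = max(bound[j], x)    # the sender certainly knows x
            if j != self.k:
                self.v[j] = max(self.v[j], x)
        self.v[self.k] += 1                # assumption: receive is an event


p1, p2, p3 = Peer(0, 4), Peer(1, 4), Peer(2, 4)
p1.receive(p3.send(0))     # P_1 first learns something about P_3
first = p1.send(1)         # carries P_1's and P_3's components
p2.receive(first)
second = p1.send(1)        # only P_1's own component has changed
p2.receive(second)
reply = p2.send(0)         # P_2 omits everything it just heard from P_1
```

In this run the first message from `p1` to `p2` carries two components, the second carries one, and the reply from `p2` carries only its own component, even though the system has four processes.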

5.  Future  Work 

The traditional way to regard time is as a linear order on the events in a system: for any
pair of distinct events $e_1, e_2$, one must have happened before the other. By deliberately
leaving unordered events that did not influence each other, the BPO opens the door for
more general classes of temporal orderings.

Besides being of theoretical interest (e.g., Pratt [17]), these alternative time models have
some exciting implications for asynchronous distributed systems. Partial orders in one form or
another lie at the heart of many application problems.^10 For example:

• Tracking concurrency. In terms of the partial orders, the distributed snapshot problem
reduces to finding a maximal set of mutually concurrent events.

• Tracking forward influence. The problem of rollback requires determining the
future of an event; if event $e_1$ is to be undone, then all events $e_2$ with $e_1 \rightarrow e_2$ must
be undone. Protocols based on linear time orders only detect a superset of what $e_1$
influenced; protocols based on partial orders give the set exactly.

• Tracking reverse influence. The problem of orphan detection requires determining,
given event $e_1$, if any aborts preceded it. Protocols based on linear time orders only
detect a superset of what influenced $e_1$; protocols based on partial orders give the set
exactly.

^9 After the 1991 document, we discovered that Singhal and Kshemkalyani [19] had previously examined
some optimization techniques for vector timestamp protocols.

^10 E.g., [1], [2], [3], [4], [8], [10], [16]. See [20] or [21] for a more thorough overview.
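
The difference between the exact sets and the supersets in the tracking problems listed above can be seen in a small Python sketch. The event names and vector values are made up for illustration, the helper names are hypothetical, and the scalar clock is assumed to be the sum of the vector components, which is one linear order consistent with the partial order but not necessarily the one a given linear-time protocol would use.

```python
# Hypothetical vector timestamps for five events on three processes.
events = {
    "e1": [1, 0, 0],   # first event at P_0
    "e2": [2, 0, 0],   # later event at P_0
    "e3": [1, 1, 0],   # P_1 receives a message sent at e1
    "e4": [0, 0, 1],   # independent event at P_2
    "e5": [0, 0, 2],   # later independent event at P_2
}


def precedes(va, vb):
    return all(x <= y for x, y in zip(va, vb)) and va != vb


def concurrent(va, vb):
    return va != vb and not precedes(va, vb) and not precedes(vb, va)


def scalar(v):
    # Assumed linear clock: any event that precedes another has a smaller sum.
    return sum(v)


def future(name):
    """Forward influence: exactly the events that follow the named event."""
    return {n for n, v in events.items() if precedes(events[name], v)}


def past(name):
    """Reverse influence: exactly the events that precede the named event."""
    return {n for n, v in events.items() if precedes(v, events[name])}


def scalar_future(name):
    return {n for n, v in events.items() if scalar(v) > scalar(events[name])}


def scalar_past(name):
    return {n for n, v in events.items() if scalar(v) < scalar(events[name])}
```

With these values, undoing `e1` under the partial order touches only `e2` and `e3`, while the scalar clock would also undo the unrelated `e5`. Likewise the partial order reports only `e1` as an ancestor of `e3`, while the scalar clock also reports `e4`.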

In his thesis proposal [20], the first author argues that solving such application problems
requires first solving the problem of maintaining partial order information, and hence
solutions to these application problems will automatically inherit the security problems of
partial order clocks. Hence developing a theory of partial order time and encapsulating
its clock primitives and security issues into a single package will provide a framework for
building secure protocols for these general application problems. Forthcoming publications
will expand on this research.